7 research outputs found

Correction: Splice site identification using probabilistic parameters and SVM classification

Author: Baten AKMA
Chang BCH
Halgamuge SK
Li Jason
Publication venue: BioMed Central
Publication date: 18/12/2006

BACKGROUND: Recent advances and automation in DNA sequencing technology have created a vast amount of DNA sequence data. This growth of sequence data demands better and more efficient analysis methods. Identifying genes in this newly accumulated data is an important issue in bioinformatics, and it requires the prediction of the complete gene structure. Accurate identification of splice sites in DNA sequences plays a central role in gene structure prediction in eukaryotes. Effective detection of splice sites requires knowledge of the characteristics, dependencies, and relationships of nucleotides in the region surrounding the splice site. Higher-order Markov models are generally regarded as a useful technique for modeling higher-order dependencies. However, their implementation requires estimating a large number of parameters, which is computationally expensive. RESULTS: The proposed method for splice site detection consists of two stages: a first-order Markov model (MM1) is used in the first stage and a support vector machine (SVM) with a polynomial kernel is used in the second stage. The MM1 serves as a pre-processing step for the SVM and takes DNA sequences as its input. It models the compositional features and dependencies of nucleotides around splice site regions in terms of probabilistic parameters. The probabilistic parameters are then fed into the SVM, which combines them nonlinearly to predict splice sites. When the proposed MM1-SVM model is compared with other existing standard splice site detection methods, it shows superior performance in all cases. CONCLUSION: We proposed an effective pre-processing scheme for the SVM and applied it to the identification of splice sites. This is a simple yet effective splice site detection method, which shows better classification accuracy and computational speed than some other more complex methods.
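
The pre-processing stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes aligned, equal-length sequences over A/C/G/T, position-specific transition tables, and add-one smoothing, none of which the abstract specifies. The function names `train_mm1` and `mm1_features` are hypothetical.

```python
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequences contain only these four symbols


def train_mm1(sequences, pseudocount=1.0):
    """Estimate position-specific first-order transition probabilities.

    Entry i-1 of the returned list maps (prev, cur) to an estimate of
    P(cur at position i | prev at position i-1), smoothed with a pseudocount.
    """
    length = len(sequences[0])
    tables = []
    for i in range(1, length):
        counts = defaultdict(float)
        for seq in sequences:
            counts[(seq[i - 1], seq[i])] += 1.0
        table = {}
        for prev in ALPHABET:
            total = sum(counts[(prev, cur)] for cur in ALPHABET)
            total += pseudocount * len(ALPHABET)
            for cur in ALPHABET:
                table[(prev, cur)] = (counts[(prev, cur)] + pseudocount) / total
        tables.append(table)
    return tables


def mm1_features(seq, tables):
    """Encode one sequence as its vector of transition probabilities."""
    return [tables[i - 1][(seq[i - 1], seq[i])] for i in range(1, len(seq))]


# Toy training set standing in for aligned true splice-site windows.
tables = train_mm1(["AAGT", "AAGT", "CAGT"])
features = mm1_features("AAGT", tables)
```

Each sequence of length n becomes a vector of n-1 probabilities. In a two-stage setup like the one described, such vectors would be the input to the classifier, for example an SVM with a polynomial kernel.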

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Fast splice site detection using information content and feature reduction

Author: AKMA Baten
BCH Chang
C Burge
C Cortes
CE Shannon
D Cai
G Dror
G Ratsch
G Yeo
H Drucker
H Itoh
H Liu
JCaHLS Rajapakse
JSaRD Chuang
L Zhang
M Burset
M Pertea
M Zhang
MB Shapiro
MG Reese
N Cristianini
P Waddell
R Castelo
S Brunak
S Buckingham
S Degroeve
S Salzberg
S Sonnenburg
S Washietl
SA Marashi
SK Halgamuge
SM Hebsgaard
T Golub
T-M Chen
TD Schneider
V Vapnik
XH-F Zhang
Y Saeys
YF Sun
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Accurate identification of splice sites in DNA sequences plays a key role in the prediction of gene structure in eukaryotes. Many computational methods have already been proposed for the detection of splice sites, and some of them show high prediction accuracy. However, most of these methods are limited by their long computation time when applied to whole-genome sequence data. Results: In this paper we propose a hybrid algorithm which combines several effective and informative input features with a state-of-the-art support vector machine (SVM). To obtain the input features we employ an information content method based on Shannon's information theory, Shapiro's score scheme, and Markovian probabilities. We also use a feature elimination scheme to remove the less informative features from the input data. Conclusion: In this study we propose a new feature-based splice site detection method that shows improved acceptor and donor splice site detection in DNA sequences when its performance is compared with various state-of-the-art and well-known methods.
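
As an illustration of the information content feature mentioned in this abstract, the sketch below computes the per-position Shannon information content of a set of aligned sequences as log2(4) minus the positional entropy. It assumes equal-length A/C/G/T sequences and omits any small-sample correction, so it is not necessarily the exact measure used in the paper. The function name `information_content` is hypothetical.

```python
import math
from collections import Counter


def information_content(sequences, alphabet="ACGT"):
    """Return the information content (in bits) of each alignment column.

    For DNA the maximum is log2(4) = 2 bits (fully conserved column) and
    the minimum is 0 bits (all four bases equally frequent).
    """
    length = len(sequences[0])
    max_bits = math.log2(len(alphabet))
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in sequences)
        total = sum(counts[base] for base in alphabet)
        entropy = 0.0
        for base in alphabet:
            if counts[base]:
                p = counts[base] / total
                entropy -= p * math.log2(p)
        profile.append(max_bits - entropy)
    return profile


# Column 0 is uniform over the four bases, column 1 is fully conserved.
profile = information_content(["AG", "CG", "GG", "TG"])
```

Positions with high information content are the conserved ones, which is one plausible basis for ranking features before a feature elimination step.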

ePublications@SCU

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Establishing bioinformatics research in the Asia Pacific

Author: A Christoffels
A Konagaya
A Suresh
AKMA Baten
AM Khan
AR Sikder
CY Lin
D Gilbert
H Sugawara
HH Lin
J Sprenger
JC Tong
LJK Wee
M Brahmachary
Martti Tammi
Michael Gribskov
R Thadani
RTH Tsai
S Bhattacharya
S Foret
S Mathivanan
S Miyano
S Ranganathan
S Ranjan
S Takasaki
Shoba Ranganathan
Tin Wee Tan
U Kulkarni-Kale
X Wu
YP Lim
Publication venue: BioMed Central
Publication date: 01/01/2006

In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet had gained sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It offers a snapshot of the growing research excellence in bioinformatics in the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

Crossref

Springer - Publisher Connector

PubMed Central

Purdue E-Pubs

Macquarie University ResearchOnline

ScholarBank@NUS

Emerging strengths in Asia Pacific bioinformatics

Author: AKMA Baten
C Ngamphiw
CC Lu
CK Liu
CW Cheng
D Gilbert
DH Tran
DTH Chang
H Sugawara
HH Lin
J Gaikwad
J Lim
JO Yang
KH Choo
PC Hsu
RTH Tsai
S Miyano
S Ranganathan
SA Lee
Shoba Ranganathan
SJ Lim
Tin Wee Tan
U Gowthaman
U Sangket
Ueng-Cheng Yang
W Tongsima
Wen-Lian Hsu
WH Wu
WS Leung
X Wang
Y Mizuno
Y Yang
YP Lim
YS Lee
Publication venue: BioMed Central
Publication date: 01/01/2008

The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20–23, 2008 in Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

Crossref

Springer - Publisher Connector

PubMed Central

Macquarie University ResearchOnline

ScholarBank@NUS

Biological Sequence Data Preprocessing for Classification: A Case Study in Splice Site Identification

Author: Baten AKMA
Chang B
Halgamuge SK
Wickramarachchi N
Publication date: 21/10/2013

The growth of biological sequence data demands better and more efficient analysis methods. Effective detection of various regulatory signals in these sequences requires knowledge of the characteristics, dependencies, and relationships of nucleotides in the region surrounding the regulatory signals. A higher-order Markov model is generally regarded as a useful technique for modeling higher-order dependencies of the nucleotides. However, its implementation requires estimating a large number of parameters, which is computationally expensive. In this paper, we propose a hybrid method consisting of a first-order Markov model for sequence data preprocessing and a multilayer perceptron neural network for classification. The Markov model captures the compositional features and dependencies of nucleotides in terms of probabilistic parameters, which are used as inputs to the classifier. The classifier combines the Markov probabilities nonlinearly for signal detection. When applied to the splice site detection problem using three widely used data sets, the proposed hybrid method is observed to model higher-order dependencies with better classification accuracy.

Digital Repository, University of Moratuwa

Splice site identification using probabilistic parameters and SVM classification

Author: Baten AKMA
Chang BCH
Halgamuge SK
Li Jason
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2006

Background: Recent advances and automation in DNA sequencing technology have created a vast amount of DNA sequence data. This growth of sequence data demands better and more efficient analysis methods. Identifying genes in this newly accumulated data is an important issue in bioinformatics, and it requires the prediction of the complete gene structure. Accurate identification of splice sites in DNA sequences plays a central role in gene structure prediction in eukaryotes. Effective detection of splice sites requires knowledge of the characteristics, dependencies, and relationships of nucleotides in the region surrounding the splice site. Higher-order Markov models are generally regarded as a useful technique for modeling higher-order dependencies. However, their implementation requires estimating a large number of parameters, which is computationally expensive. Results: The proposed method for splice site detection consists of two stages: a first-order Markov model (MM1) is used in the first stage and a support vector machine (SVM) with a polynomial kernel is used in the second stage. The MM1 serves as a pre-processing step for the SVM and takes DNA sequences as its input. It models the compositional features and dependencies of nucleotides around splice site regions in terms of probabilistic parameters. The probabilistic parameters are then fed into the SVM, which combines them nonlinearly to predict splice sites. When the proposed MM1-SVM model is compared with other existing standard splice site detection methods, it shows superior performance in all cases. Conclusion: We proposed an effective pre-processing scheme for the SVM and applied it to the identification of splice sites. This is a simple yet effective splice site detection method, which shows better classification accuracy and computational speed than some other more complex methods.

Directory of Open Access Journals

Biocomplexity as a Challenge for Biological Theory

Author: AKMA Baten
AS Ribeiro
B Lobitz
DL Hull
EH Davidson
ER Dougherty
K Strange
KY Yeung
Manfred D. Laubichler
U Krohs
WC Salmon
Werner Callebaut
WK Michener
Publication venue: MIT Press - Journals

Crossref